High Speed Clock Distribution Transmission Line Network

ABSTRACT

The invention is directed to a method for clock distribution and to VLSI circuits that include a clock distribution network. In a method of the invention, transmission lines are patterned so as to connect a clock tree, and a periodic waveform clock, preferably a sine waveform, is used to control clock skew, even at frequencies extending into the gigahertz range. In an exemplary embodiment of the invention, an overlay includes differential pairs of transmission lines that connect the drivers of a clock distribution tree. In preferred embodiments of the invention, an H-tree clock distribution scheme is overlaid with a spiral of transmission lines, each realized by differential conductors and driven using a sinusoidal standing wave to distribute global clock signals into local regions of the chip. Each transmission line connects drivers in the H-tree that are at the same level of the H-tree. In a VLSI chip according to an embodiment of the invention, the transmission line overlay delivers sinusoidal clock signals to local areas, where they are locally converted into digital clock signals. The invention thus presents a passive technique for clock distribution.

PRIORITY CLAIM

This application claims priority under 35 U.S.C. §119 from prior application Ser. No. 06/573,922, filed May 25, 2004.

STATEMENT OF GOVERNMENT INTEREST

The invention was made with Government assistance under grant number CCR9987678 awarded by the National Science Foundation. The Government has certain rights in this invention.

FIELD OF THE INVENTION

The field of the invention is VLSI (very large scale integrated) devices, e.g., microprocessors.

BACKGROUND ART

Commercial microprocessors currently operate on clock signals in the gigahertz range. The scale of today's VLSI designs requires the designs to account for clock skew. Clock skew is the relative difference in the time at which the clock signal reaches different parts of the integrated circuit. In a microprocessor, for example, a global clock signal must be distributed to different parts of the chip. This internal clock signal must be distributed to a large number of clock pins. As clock frequencies increase, the skew can be a limiting factor. With increasing clock frequency, the clock skew caused by many nondeterministic factors, such as process variations, supply voltage fluctuation and temperature gradient, consumes a significant portion of the clock period. For high performance synchronous circuitry, the design of a robust global clock distribution system which can sustain various parameter variations becomes an increasingly difficult and time-consuming task.

As a result, reducing clock skew is a goal in the art. RC shunted networks have been successfully used to reduce the clock skew under process variations. Three wide spine shunts have been proposed to reduce the skew between the leaf nodes of a very deep driver tree. See, e.g., N. A. Kurd, et al., “A Multigigahertz Clocking Scheme for the Pentium® 4 Microprocessor,” IEEE Journal of Solid-State Circuits, Vol. 36, No. 11, November 2001, pp. 1647-53. Others have proposed a clock mesh driven by a balanced H-tree for global clock distribution. See, e.g., M. Orshansky, L. Milor, P. Chen, K. Keutzer and C. Hu, “Impact of Spatial Intrachip Gate Length Variability on the Performance of High-Speed Digital Circuit,” IEEE Trans. on CAD, Vol. 21, No. 5, May 2002, pp. 544-553.

However, when the clock frequency increases to the multi-gigahertz range, the inductance effect of the shunt wires becomes significant. Clock meshes are used in the industry to reduce skew. Clock meshes form an RC wire network. The inductance effect of the RC network is ignored at clock frequencies of present commercial chips, e.g., the 4 GHz Pentium 4. However, the trend is toward higher clock frequencies at which the inductance effect can no longer be ignored. Additionally, for example, at a 10 GHz clock rate, the time of flight between two corners of a chip is comparable to the clock cycle. The RC model of the shunt effect is not valid at such frequencies. The inductance of the shunt can even cause worse skew.

Active circuits have been proposed to address clock skew. Particular examples include the following. Phase detectors and coupled oscillators have been proposed with shunts of less than a quarter wavelength to lock the oscillators together. See Galton et al., “Clock Distribution Using Coupled Oscillators,” Proc. of ISCAS 1996, vol. 3, pp. 217-220. Active feedback with phase detectors and distributed phase locked loops have also been proposed. Gutnik and Chandrakasan, “Active GHz Clock Network Using Distributed PLLs,” IEEE Journal of Solid-State Circuits, pp. 1553-1560, vol. 35, No. 11, November 2000. Combined clock generation and distribution using standing wave oscillators has been proposed. O'Mahony et al., “Design of a 10 GHz Clock Distribution Network Using Coupled Standing-Wave Oscillators,” Proc. of DAC, pp. 682-687, June 2003. This work distributes sine waves, as opposed to the conventional approach of distributing square waves. However, the distribution scheme of O'Mahony et al. does not use a global clock source. Instead, clocks are generated locally and distributed. Wood, et al., “Rotary Traveling-Wave Oscillator Arrays: A New Clock Technology,” IEEE JSSC, pp. 1654-1665, November 2001. The use of active components may be successful in overcoming clock skew at high clock frequencies. Compared to a passive scheme, though, the active component approach raises stability issues and, in some cases, may be more sensitive to process variations during fabrication.

SUMMARY OF THE INVENTION

The invention is directed to a method for clock distribution and to VLSI circuits that include a clock distribution network. In a method of the invention, transmission lines are patterned so as to connect a clock tree, and a periodic waveform clock, preferably a sine waveform, is used to control clock skew, even at frequencies extending into the gigahertz range. In an exemplary embodiment of the invention, an overlay includes differential pairs of transmission lines that connect the drivers of a clock distribution tree. In preferred embodiments of the invention, an H-tree clock distribution scheme is overlaid with a spiral of transmission lines, each realized by differential conductors and driven using a sinusoidal standing wave to distribute global clock signals into local regions of the chip. Each transmission line connects drivers in the H-tree that are at the same level of the H-tree. In a VLSI chip according to an embodiment of the invention, the transmission line overlay delivers sinusoidal clock signals to local areas, where they are locally converted into digital clock signals. The invention thus presents a passive technique for clock distribution. The technique is robust, as the differential transmission lines are relatively insensitive to process variations. For example, when the lines are further apart capacitance increases while inductance decreases, providing a form of self-compensation responsive to process variations.

In a preferred H-tree embodiment overlaid with a spiral set of transmission lines, each level in the H-tree is connected by a transmission line. In the overlay, shorter spiral transmission lines may be made wider, and the lines become gradually thinner in the longer sets of transmission lines in the spiral. The geometry of the network of transmission lines will be dictated by the nature of the clock network that is interconnected by the transmission lines, and the H-tree/spiral transmission line embodiment presents an example that artisans will appreciate can be varied to suit a clock tree having a different shape.

Embodiments of the invention also include optimized clock distribution networks. The invention presents a method to identify optimal total transmission line areas for single level and multiple level transmission line clock distribution networks.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 (prior art) is a block diagram of a clock driver that may be used locally in a clock distribution network of the invention to convert a distributed sine wave clock signal to a square wave for local registers in a VLSI circuit;

FIGS. 2A and 2B are schematic diagrams illustrating a preferred embodiment clock distribution circuit of the invention, with FIG. 2A illustrating an H-tree clock distribution network and FIG. 2B illustrating a hierarchical transmission line shunt network to shunt clock drivers in the H-tree clock distribution network of FIG. 2A;

FIG. 3 is a partial view of a pair of transmission lines used in the transmission line shunt network of FIG. 2B;

FIG. 4 illustrates the local distribution of a clock signal from a lowest level clock driver in the clock distribution circuit of FIGS. 2A and 2B;

FIG. 5 is a simplified circuit diagram of two clock drivers from the circuit of FIGS. 2A and 2B and a transmission line shunt from the shunt network of FIG. 2B; and

FIG. 6 shows simulated waveforms for the circuit model of FIG. 5.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

The invention provides clock distribution methods and circuits that use a hybrid structure of a clock distribution tree, e.g., an H-tree, and a differential transmission line shunt to shunt a level of the clock distribution tree, or more preferably, multiple differential transmission line shunts to shunt multiple levels of the clock distribution tree. The clock is distributed as differential signals of periodic waves, e.g., sinusoidal waves. Even at high frequencies, e.g., 10 GHz and higher, the clock distribution method of the invention provides an output to levels of the clock distribution tree that exhibits very small skew. In a VLSI circuit of the invention, a square-wave clock signal is recovered locally and provided to registers all over the circuit.

In a preferred embodiment, an H-tree clock distribution circuit is shunted by transmission lines. The transmission lines are driven at discrete points and bent into a spiral pattern in order to link the clock drivers of the H-tree clock distribution network. The clock drivers of the H-tree are shunted level by level. The shunt lengths between the clock drivers are an integral multiple of the wavelength. For the ideal case in which the line is lossless, a standing wave can lock the clock drivers to zero skew. For lossy shunts, embodiments of the invention provide an optimized wire width for the transmission lines to produce the smallest skew for the multi-level network based on the analytical skew function.

Clock distribution methods and circuits in accordance with preferred embodiments of the invention can provide several advantages. There is no direct feedback path from the transmission line network to the clock source. The transmission lines are a linear network, and thus the design and optimization involve no active components. Another advantage is that the energy storage capability of the locked standing wave in the transmission line can mitigate the clock jitter. Additionally, the power consumption of the network is low as a result of the resonance effect of the transmission line.

Preferred embodiments of the invention will now be discussed with respect to the drawings, while artisans will appreciate broader aspects of the invention from the discussion of the preferred embodiments. Schematic drawings are used, and will be understood by artisans. In the preferred embodiments, differential sinusoidal waves are used for global clock distribution. The sinusoidal waveform simplifies the analysis of resonance phenomena of the transmission line, permitting implementation of optimization methods of the invention. In addition, the differential signals provide a well-controlled current return loop, thus improving the predictability of the inductance value.

In a VLSI implementation, the distributed sine wave clock signals will have to be converted locally to square wave signals. A clock driver may be used for this conversion. Such an exemplary driver has two stages. An exemplary clock driver for conversion is described, for example, in O'Mahony et al., “Design of a 10 GHz Clock Distribution Network Using Coupled Standing-Wave Oscillators,” DAC 2003, pp. 682-687, June 2003.

FIG. 1 is a block diagram illustrating a two-stage clock driver for local conversion of a sine wave to a square wave, based upon O'Mahony et al. A first stage differential transistor pair 10 includes a small gate-overdrive for complete current switching. It amplifies and limits the signal so the output amplitude is roughly independent of the input amplitude. A low-pass filter 12 attenuates the harmonics added by the limiting amplifier that would otherwise cause amplitude-dependent skew. A sine-to-square converter 14 forms a second stage. As indicated in O'Mahony, use of cross-coupled inverters and a shunt resistor in the sine-to-square converter can achieve a well-controlled 50% duty cycle over process, temperature, frequency, and supply variation. This type of two-stage clock driver can achieve amplitude-dependent skew below 1 ps.

In the following discussion of preferred embodiments, and particularly the discussion of optimized transmission line wire widths in preferred embodiments, a simple linear variation model is used to represent the systematic spatial variations on wire widths and transistor lengths. For any location (x, y) on the chip, the actual geometrical parameter is d = d₀ + k_x x + k_y y, where d₀ is the nominal parameter and k_x and k_y are the horizontal and vertical variation coefficients, respectively. The maximum variations across the chip are assumed to be ±10% of the ideal value. This “pseudo-deterministic” linear variation model can be regarded as a “worst case” scenario of the probabilistic variations. This simple model can be replaced with more sophisticated variation models when implementing wire width optimizations in accordance with the invention, as will be appreciated by artisans. When analyzing clock skew levels for preferred embodiments and optimizations, the supply voltage fluctuation is taken into account. Specifically, it is assumed that the supply voltages are a set of independent random variables within ±10% of a nominal V_dd value.
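The linear variation model can be sketched in a few lines of Python. This is a minimal illustration rather than anything taken from the specification: the function names, the choice of measuring (x, y) from the chip center, and the equal split of the 10% excursion between the two axes are all assumptions made for the example.

```python
def actual_parameter(d0, kx, ky, x, y):
    """Linear spatial variation model from the text: d = d0 + kx*x + ky*y."""
    return d0 + kx * x + ky * y


def corner_to_corner_coefficients(d0, chip_width, chip_height, max_fraction=0.10):
    """Hypothetical helper (not from the source).

    Picks kx and ky so that, with (x, y) measured from the chip center,
    the parameter reaches d0*(1 + max_fraction) at one corner and
    d0*(1 - max_fraction) at the opposite corner. Splitting the excursion
    equally between the two axes is an assumption of this sketch.
    """
    kx = max_fraction * d0 / chip_width
    ky = max_fraction * d0 / chip_height
    return kx, ky


# Example: a nominal 1 um wire width on a 2 cm x 2 cm chip (20000 um per side).
side_um = 20000.0
kx, ky = corner_to_corner_coefficients(1.0, side_um, side_um)
widest = actual_parameter(1.0, kx, ky, side_um / 2, side_um / 2)
narrowest = actual_parameter(1.0, kx, ky, -side_um / 2, -side_um / 2)
```

Under these assumptions the width is nominal at the chip center and reaches the +10% and -10% extremes at opposite corners, which is one way to realize the worst-case gradient the model describes.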

FIG. 2A shows an H-tree clock distribution network 16 and FIG. 2B shows a transmission line shunt network 18 for use with the H-tree clock distribution network 16 of FIG. 2A. The figures are presented separately for clarity, as an overlay of the two figures hides the structure of the H-tree network. The H-tree network includes a plurality of clock drivers 20_N, each of which belongs to one of three levels in the H-tree network. A number of the drivers from each level are labelled as either 20₁, 20₂, or 20₃, and the same drivers that are labelled in FIG. 2A are also labelled in the transmission line shunt network 18 of FIG. 2B. Each of three differential transmission lines 22₁, 22₂, and 22₃ shunts clock drivers 20_N in a corresponding level of the H-tree clock distribution network 16.

The natural-frequency shunt wires of the differential transmission lines 22₁, 22₂, and 22₃ are sized to reduce the skew between clock drivers 20₁, 20₂, 20₃. The transmission lines 22₁, 22₂, and 22₃ are arranged in hierarchical transmission line spirals. Each spiral consists of a multiple-wavelength-long coplanar differential pair 26, including separate conductors 26₁ (clock +) and 26₂ (clock −) disposed relative to a ground plane 28, as shown in FIG. 3. The spiral shape of the transmission lines 22₁, 22₂, and 22₃ results from the layout of the clock distribution network. Other networks can produce different shapes. Arbitrarily shaped transmission line shunt networks may be utilized, however, if a necessary condition is met. The necessary condition is that the length of transmission line between clock drivers is an integral multiple of the wavelength of the clock signal being distributed.
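The necessary condition on line length lends itself to a simple layout check. The following Python sketch is illustrative only: the helper names, the tolerance, and the segment lengths are hypothetical, and the phase velocity is an assumed value back-computed from the 10.336 GHz clock and 1 cm wavelength quoted in the experimental results.

```python
def wavelength(phase_velocity, frequency):
    """Guided wavelength in meters, given phase velocity (m/s) and frequency (Hz)."""
    return phase_velocity / frequency


def is_integral_multiple(length, lam, rel_tol=1e-6):
    """True if length is n*lam for some integer n >= 1, within rel_tol*lam."""
    n = round(length / lam)
    return n >= 1 and abs(length - n * lam) <= rel_tol * lam


# Assumed phase velocity, chosen so that 10.336 GHz gives a 1 cm wavelength.
V_PHASE = 1.0336e8  # m/s, illustrative
lam = wavelength(V_PHASE, 10.336e9)

# Hypothetical driver-to-driver line lengths along one spiral, in meters.
segment_lengths = [0.01, 0.02, 0.025]
checks = [is_integral_multiple(s, lam) for s in segment_lengths]
```

In this example the 1 cm and 2 cm segments satisfy the condition and the 2.5 cm segment does not, so a layout containing the last segment would need to be rerouted or lengthened.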

Clock drivers 20_N are evenly distributed on every spiral and the separation between two neighboring clock drivers is one wavelength. The H-tree network 16 distributes sinusoidal clock signals from a central clock source 30 at its center (which would be, for example, the center of a VLSI chip) to all the clock drivers 20_N. The signal arrival time is designed to be equal for all the clock drivers 20_N on a common differential transmission line 22_N of the shunt network 18. In a VLSI implementation, each of the lowest level clock drivers 20₁ would connect to a local distribution tree or mesh 34, as shown in FIG. 4, to send clock signals from the clock drivers on the lowest level spiral to the numerous clocking elements 36 in the VLSI circuit.

The transmission lines 22₁, 22₂, and 22₃ in the transmission line shunt network 18 may be optimized. Variations in the sizes, relative distances, etc. of the differential pairs 26 that make up the transmission lines 22₁, 22₂, and 22₃ can be set to achieve various levels of skew. Minimized skew is produced in preferred embodiments, while designers may implement less than optimal shunt line networks 18 in accordance with the invention while still achieving significant advantages.

Outlining the design approach for transmission line shunt networks 18 of the invention will provide artisans with the ability to account for trade-offs in particular VLSI implementations. For example, for the same amount of routing area, assigning clock drivers to spirals at different levels can have a different impact on clock skew. In the following, an optimal way to distribute the routing resources to the spirals at different levels of the shunt network 18, such that the minimum skew is achieved on the lowest level spiral with a given routing area budget, will be discussed.

The optimization problem is addressed as a spiral sizing problem for the transmission lines 22₁, 22₂, and 22₃. It is assumed that there is a spiral network applied to an H-tree as in the embodiment of FIGS. 2A and 2B. It is also assumed that the total routing area is constrained. The goal is to minimize the skew on the lowest level of the clock distribution network 16, namely, at the drivers 20₁. The optimum wire width w_i of the spirals at level i, for i = 1 to n, will be determined so that clock skew is minimized.

A simplified circuit model for the transmission lines 22₁, 22₂ and 22₃ is shown in FIG. 5 to study the skew reduction mechanism of a one-wavelength-long transmission line shunt. In FIG. 5, two clock drivers 20_N with driving resistance R_s and input phase shift (skew) Φ are connected by an RLGC (resistance, inductance, conductance, capacitance) transmission line 22_N that is exactly one wavelength long. The outputs at the two separated terminals, V₁ and V₂, are synchronized by the shunt transmission line 22_N.

FIG. 6 shows simulated waveforms for the circuit model of FIG. 5. If it is assumed that the input skew Φ between input voltages V_s1 and V_s2 is 30 degrees, the resultant skew between output voltages V₁ and V₂ is only 0.7 degree. In FIG. 6, the two larger magnitude curves are the skewed input voltages V_s1 and V_s2. The two smaller magnitude curves, aligned to a high precision, are the output voltages V₁ and V₂. Assuming that the input skew is small and R < ωL (where L is the inductance of the shunt, R is the resistance of the shunt, and ω is the clock frequency), the following skew expression is obtained by superposition of all possible traveling and standing waves in the transmission line.

$$\Delta\Phi = \frac{1 - e^{-\frac{\pi R}{\omega L}}}{1 + e^{-\frac{\pi R}{\omega L}}}\,\Phi \qquad (1)$$

SPICE simulations have been used to validate equation (1). From the skew equation (1), it is apparent that when the resistance R approaches zero, the transmission line becomes lossless. As a result, ΔΦ, the phase shift between voltages V₁ and V₂, also approaches zero, and the two clock drivers become fully synchronized. When R approaches infinity, nodes 1 and 2 are open, at which point there is no shunt effect and the phase shift between nodes 1 and 2 remains the input skew, Φ.
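The limiting behavior of equation (1) can be checked numerically. The Python sketch below is illustrative: the function names are made up for this example, and the inversion helper simply solves equation (1) for R/(ωL) under the assumption that the equation holds exactly, which the simulated circuit of FIG. 5 (with its driving resistance R_s) need not satisfy.

```python
import math


def shunted_skew(phi_in, r_over_wl):
    """Equation (1): output skew for two drivers joined by a one-wavelength line.

    r_over_wl is the dimensionless ratio R/(omega*L).
    """
    a = math.exp(-math.pi * r_over_wl)
    return (1.0 - a) / (1.0 + a) * phi_in


def ratio_for_target(phi_in, phi_out):
    """Invert equation (1) for R/(omega*L).

    Uses the identity (1 - e^-x)/(1 + e^-x) = tanh(x/2), x = pi*R/(omega*L).
    """
    return 2.0 * math.atanh(phi_out / phi_in) / math.pi


lossless = shunted_skew(30.0, 0.0)      # R -> 0: drivers fully synchronized
open_line = shunted_skew(30.0, 1.0e3)   # R very large: input skew passes through

# Ratio that would map 30 degrees to 0.7 degree if equation (1) held exactly;
# it evaluates to about 0.015.
fig6_ratio = ratio_for_target(30.0, 0.7)
```

The two limiting values reproduce the R approaching zero and R approaching infinity cases discussed above.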

A skew expression that characterizes the shunt effect of multiple clock drivers connected to a transmission line may also be derived under the following assumptions: i) the transmission line is infinitely long and the clock drivers are spaced evenly on the transmission line with a separation of one wavelength; ii) the input phase of each voltage source is a random number uniformly distributed in [0, Φ]. Because it is an infinitely long line, it can be assumed that there are two nodes a, b having exact phase 0 and Φ, respectively. Then, it is possible to compute the expected phase of these two points, and take the difference of the expectations as the skew.

Assume the driving resistance is much larger than the characteristic impedance of the transmission line and the input skew is small. Using a technique similar to that used in the derivation of equation (1), the following skew equation is obtained.

$$\Delta\Phi = \frac{1 - e^{-\frac{3\pi R}{\omega L}}}{1 + e^{-\frac{3\pi R}{\omega L}}}\,\Phi \qquad (2)$$
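Equations (1) and (2) share the same form, and a standard identity makes them easier to compare. The block below restates both using the hyperbolic tangent and adds the small-loss approximation. The approximation is an observation added here for illustration and is not a statement from the derivation itself.

```latex
% Identity: (1 - e^{-x}) / (1 + e^{-x}) = \tanh(x/2)
\begin{align*}
\text{Eq. (1):}\quad \Delta\Phi &= \tanh\!\left(\frac{\pi R}{2\omega L}\right)\Phi
  \approx \frac{\pi R}{2\omega L}\,\Phi \quad (R \ll \omega L) \\
\text{Eq. (2):}\quad \Delta\Phi &= \tanh\!\left(\frac{3\pi R}{2\omega L}\right)\Phi
  \approx \frac{3\pi R}{2\omega L}\,\Phi \quad (R \ll \omega L)
\end{align*}
```

In both cases the residual skew grows roughly linearly with R/(ωL) when losses are small, which suggests why the ratio R/L is the quantity related to wire width in the discussion that follows.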

An optimum area for transmission lines may now be determined. To provide an example, it is assumed that a pair of coplanar copper transmission lines is used to construct a spiral shunt. The two parallel differential wires have a height of 240 nm and the same width w. The separation between them is 2 μm, and the wires are 3.5 μm above a ground plane. Typical values of w range from 0.5 to 40 μm.

A fast field solver was used to obtain the frequency-dependent resistance, R, and inductance, L. Linear regression was used to obtain the relation between the resistance/inductance ratio, R/L, and the wire width, w. The relation between R/L and 1/w displays excellent linearity.
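The regression step can be sketched as an ordinary least-squares fit of R/L against 1/w. The Python below is a sketch under stated assumptions: the data points are made-up stand-ins (not field-solver output), and the helper `exponent_constants` reflects one plausible reading of how a fitted line could feed the constants c_i and k_i of equation (4), namely by substituting the line into the exponent of equation (2). The specification does not spell out this step.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def exponent_constants(slope, intercept, omega):
    """Assumed reading: with R/L = slope/w + intercept, the factor
    exp(-3*pi*(R/L)/omega) of equation (2) equals c*exp(-k/w)."""
    c = math.exp(-3.0 * math.pi * intercept / omega)
    k = 3.0 * math.pi * slope / omega
    return c, k


# Made-up stand-in data (NOT field-solver output): R/L in 1/s versus width in um.
TRUE_SLOPE, TRUE_INTERCEPT = 4.0e9, 1.0e8
widths_um = [0.5, 1.0, 2.0, 5.0, 10.0, 20.0, 40.0]
r_over_l = [TRUE_SLOPE / w + TRUE_INTERCEPT for w in widths_um]

slope, intercept = fit_line([1.0 / w for w in widths_um], r_over_l)
omega = 2.0 * math.pi * 10.336e9  # angular clock frequency, rad/s
c, k = exponent_constants(slope, intercept, omega)
```

Because the stand-in data lie exactly on a line in 1/w, the fit recovers the generating slope and intercept. With real solver data the residuals would indicate how well the linear relation holds.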

The skew function of each level of the spiral shunt network (modeling the network 18 of FIG. 2B) may be rewritten as

$$\Delta\Phi = \frac{1 - c_i e^{-\frac{k_i}{w_i}}}{1 + c_i e^{-\frac{k_i}{w_i}}}\,\Phi \qquad (4)$$

where w_i is the width of the ith level spiral and c_i, k_i are constants for the level i spiral. The optimal spiral sizing problem is written as the following mathematical program:

$$\begin{aligned}
\text{Min:}\quad & \Delta\Phi = \left(\cdots\left(\left(\Phi_1\,\frac{1 - c_1 e^{-\frac{k_1}{w_1}}}{1 + c_1 e^{-\frac{k_1}{w_1}}} + \Phi_2\right)\frac{1 - c_2 e^{-\frac{k_2}{w_2}}}{1 + c_2 e^{-\frac{k_2}{w_2}}} + \Phi_3\right)\cdots + \Phi_n\right)\frac{1 - c_n e^{-\frac{k_n}{w_n}}}{1 + c_n e^{-\frac{k_n}{w_n}}} \\
\text{s.t.:}\quad & \sum_{i=1}^{n} l_i w_i = A
\end{aligned} \qquad (5)$$

In the program (5), Φ_i is the skew of signal propagation from the level i−1 spiral to the level i spiral, and l_i and w_i are the length and width of the spiral of level i. The objective is to minimize skew under the maximum routing area constraint A.
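The nested objective of program (5) is easier to read as a loop that accumulates skew level by level. The following Python sketch is illustrative: the function names are hypothetical and every constant in the example call is made up.

```python
import math


def level_factor(w, c, k):
    """Per-level attenuation (1 - c*e^(-k/w)) / (1 + c*e^(-k/w))."""
    u = c * math.exp(-k / w)
    return (1.0 - u) / (1.0 + u)


def total_skew(widths, phis, cs, ks):
    """Objective of program (5).

    At each level i, the skew carried over from the previous level plus the
    newly introduced skew phi_i is attenuated by that level's factor.
    """
    skew = 0.0
    for w, phi, c, k in zip(widths, phis, cs, ks):
        skew = (skew + phi) * level_factor(w, c, k)
    return skew


def routing_area(widths, lengths):
    """Left-hand side of the area constraint: sum of l_i * w_i."""
    return sum(l * w for l, w in zip(lengths, widths))


# Made-up two-level example: widths, per-level skews, and constants c_i, k_i.
example_skew = total_skew([1.0, 2.0], [10.0, 5.0], [0.9, 0.8], [1.5, 1.0])
example_area = routing_area([1.0, 2.0], [3.0, 4.0])
```

Writing the objective this way also makes it straightforward to pass to a numerical optimizer together with the area constraint.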

The following lemma has been proved.

Lemma: $f(w) = \frac{1 - c\,e^{-k/w}}{1 + c\,e^{-k/w}}$ is a convex function on $w \in \left[\frac{k}{2}, \infty\right)$, where k is a positive constant.

The above lemma suggests that, when the wire of the transmission line is wide enough, the skew-wire-width relation is convex. In order to make the program convex, a set of minimal wire width constraints may be imposed upon each level spiral.
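The lemma can be spot-checked numerically with a centered second difference, which is non-negative wherever a function is convex. The Python sketch below is a sanity check, not a proof: the constants c = 0.9 and k = 2.0, the step size, and the sampling range are arbitrary choices made for illustration.

```python
import math


def f(w, c, k):
    """The lemma's function (1 - c*e^(-k/w)) / (1 + c*e^(-k/w))."""
    u = c * math.exp(-k / w)
    return (1.0 - u) / (1.0 + u)


def second_difference(w, c, k, h=1e-2):
    """Centered second difference; approximately f''(w) * h**2."""
    return f(w + h, c, k) - 2.0 * f(w, c, k) + f(w - h, c, k)


def looks_convex_on_range(c, k, w_max, samples=200, h=1e-2):
    """Sample the second difference on [k/2 + h, w_max]."""
    w_min = k / 2.0 + h  # keeps w - h inside [k/2, infinity)
    step = (w_max - w_min) / (samples - 1)
    return all(
        second_difference(w_min + i * step, c, k, h) >= -1e-12
        for i in range(samples)
    )


C_EXAMPLE, K_EXAMPLE = 0.9, 2.0  # illustrative constants only
convex_above = looks_convex_on_range(C_EXAMPLE, K_EXAMPLE, w_max=10.0)
curvature_below = second_difference(0.4, C_EXAMPLE, K_EXAMPLE)
```

For these constants the sampled second differences are non-negative above k/2, while the value at w = 0.4 (below k/2 = 1) is negative, which is consistent with the need for minimal wire width constraints.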

In experiments that were conducted, the minimal wire widths of each level spiral were set as 0.6 μm, 1.3 μm, 1.3 μm (lowest to highest level). With the minimal wire width constraints for each level spiral, the following convex program is obtained.

$$\begin{aligned}
\text{Min:}\quad & \Delta\Phi = \left(\cdots\left(\left(\Phi_1\,\frac{1 - c_1 e^{-\frac{k_1}{w_1}}}{1 + c_1 e^{-\frac{k_1}{w_1}}} + \Phi_2\right)\frac{1 - c_2 e^{-\frac{k_2}{w_2}}}{1 + c_2 e^{-\frac{k_2}{w_2}}} + \Phi_3\right)\cdots + \Phi_n\right)\frac{1 - c_n e^{-\frac{k_n}{w_n}}}{1 + c_n e^{-\frac{k_n}{w_n}}} \\
\text{s.t.:}\quad & \sum_{i=1}^{n} l_i w_i = A \\
& w_i > m_i, \quad \forall i \in \{1, 2, \ldots, n\}
\end{aligned} \qquad (6)$$

Due to the convex property of the program (6), the following theorem is obtained.

Theorem: The local optimum of the programming (6) is the global optimum.

According to the above theorem, many numerical methods, such as gradient descent and line search methods, can be adopted to solve this class of program. In example experiments, the programs were solved using the optimization package of MATLAB. The example experimental results are presented below.
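As a small illustration of the line search idea, the Python sketch below solves a two-level instance of program (6) by eliminating w₂ through the area equality and running a golden-section search over w₁. This is not the MATLAB procedure used in the experiments; golden-section search is swapped in here for simplicity, and every constant (skews, c_i, k_i, lengths, area, minimal widths) is made up. The search relies on the objective having a single minimum over the feasible interval.

```python
import math


def level_factor(w, c, k):
    """Per-level attenuation (1 - c*e^(-k/w)) / (1 + c*e^(-k/w))."""
    u = c * math.exp(-k / w)
    return (1.0 - u) / (1.0 + u)


def total_skew(widths, phis, cs, ks):
    """Nested objective of program (6), accumulated level by level."""
    skew = 0.0
    for w, phi, c, k in zip(widths, phis, cs, ks):
        skew = (skew + phi) * level_factor(w, c, k)
    return skew


def golden_section_min(func, lo, hi, tol=1e-8):
    """Minimize a single-minimum function on [lo, hi]; returns the argmin."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = func(c), func(d)
    while b - a > tol:
        if fc < fd:
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = func(c)
        else:
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = func(d)
    return (a + b) / 2.0


# Illustrative two-level instance; every number here is made up.
PHIS = [10.0, 10.0]    # per-level input skews
CS = [0.9, 0.9]        # c_i
KS = [1.5, 1.5]        # k_i
LENGTHS = [1.0, 1.0]   # l_i
AREA = 10.0            # A
MIN_W = [1.0, 1.0]     # m_i, chosen >= k_i/2 to stay in the lemma's range


def widths_from_w1(w1):
    """Use the area equality l1*w1 + l2*w2 = A to eliminate w2."""
    return [w1, (AREA - LENGTHS[0] * w1) / LENGTHS[1]]


def objective(w1):
    return total_skew(widths_from_w1(w1), PHIS, CS, KS)


W1_LO = MIN_W[0]
W1_HI = (AREA - LENGTHS[1] * MIN_W[1]) / LENGTHS[0]
best_w1 = golden_section_min(objective, W1_LO, W1_HI)
best_widths = widths_from_w1(best_w1)
best_skew = objective(best_w1)
```

For more than two levels, a general constrained solver would be the natural choice; the one-dimensional reduction is used here only to keep the example short.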

EXPERIMENTAL RESULTS

In the experiments, the chip size was set to be 2 cm by 2 cm, and a three level spiral (like that shown in FIG. 2B) was used to shunt clock signals. The clock frequency is 10.336 GHz. The wavelength is exactly 1 cm. The three spirals had 4, 9, and 17 clock drivers, respectively. A balanced H-tree was synthesized to distribute the clock signal from the center of the chip to the clock drivers. The designed arrival time of all drivers on the same level spiral is equal. With the given process variation model, SPICE simulations were used to obtain the worst skew of the signal propagation from one level to the next level. These skews were used as the values of Φ_i in the convex program. Routing area was normalized to the area of the bottom level spiral with 1 μm wire width.

TABLE 1: Optimized wire width of each level spiral for 3 level spiral

| Total Area | W1 (μm) | W2 (μm) | W3 (μm) | Skew M (ps) | Skew S (ps) | Impr. (%) |
|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 23.15 | 23.15 | 0% |
| 0.5 | 1.7 | 0 | 0 | 17.796 | 20.50 | 13% |
| 1 | 1.9308 | 1.0501 | 0 | 12.838 | 14.764 | 13% |
| 3 | 2.5751 | 1.3104 | 1.3294 | 8.6087 | 8.7309 | 15% |
| 5 | 2.9043 | 3.7559 | 2.3295 | 6.2015 | 6.3169 | 16% |
| 10 | 3.1919 | 4.5029 | 6.8651 | 4.2755 | 5.2131 | 18% |
| 15 | 3.6722 | 6.1303 | 10.891 | 2.4917 | 3.5182 | 29% |
| 20 | 4.0704 | 7.5001 | 15.072 | 1.7070 | 2.6501 | 37% |
| 25 | 4.4040 | 8.6979 | 19.359 | 1.2804 | 2.1243 | 40% |

Table 1 lists the optimized wire width of each level spiral for different total routing areas. W1, W2, and W3 are the optimal wire widths of the level 1, level 2 and level 3 spirals, respectively. For comparison, the skew was also simulated on a single-level spiral network, which uses only the bottom level spiral to shunt all the leaf nodes of the H-tree. The single-level spiral network was given the same total routing area as the multi-level spiral network. Columns 5 and 6 are the skews of the multi-level spirals (Skew M) and the single-level spiral (Skew S). Column 7 shows the skew improvement of the multi-level spirals over the single-level spiral. When the total routing area is small, the optimal configurations prefer to allocate routing resources to the higher level spiral. As the routing area gradually increases, more resources are allocated to the bottom level spiral. Compared with the single-level spiral, the optimized multi-level spiral can reduce the skew by up to 40%.

Simulations also compared the power consumption of an optimized multilevel spiral network with that of a single-level spiral. In Table 2, the first row lists the total routing areas of the multi-level spirals; the second and third rows list the power consumption of the multilevel spiral (PM) and the single-level spiral (PS) for the given total routing area. The simulated results show that the multilevel spiral can reduce the power consumption by up to 81%.

TABLE 2: Power Consumption Comparisons

| Area | 3 | 4 | 5 | 7 | 10 | 15 | 20 | 25 |
|---|---|---|---|---|---|---|---|---|
| PM (mW) | 0.4 | 0.5 | 0.7 | 0.9 | 1.0 | 1.4 | 1.5 | 1.6 |
| PS (mW) | 0.83 | 1.5 | 2.1 | 2.64 | 3.04 | 4.7 | 7.2 | 8.3 |
| Reduction (%) | 48 | 67 | 67 | 66 | 67 | 70 | 79 | 81 |

The robustness of the optimized spiral network against supply voltage fluctuations was also tested in simulations. For the test, the supply voltage of every clock driver was perturbed independently by a random number within 10% of its nominal value. Five experiments were performed on each network. The worst case skew and average case skew are shown in Table 3, which compares the skew of the optimized multilevel spiral network (Skew-M) with that of the single-level spiral network (Skew-S). The last column of Table 3 lists the improvement of the average case skew. The multilevel spiral network improves the skew by up to 55%.

TABLE 3: Skew in the presence of voltage variations

| Area | Skew-S Ave. | Skew-S Worst | Skew-M Ave. | Skew-M Worst | Impr. (%) |
|---|---|---|---|---|---|
| 0 | 28.4 | 36.5 | 28.4 | 36.5 | 0% |
| 3 | 9.75 | 12.33 | 8.75 | 9.07 | 11% |
| 5 | 7.32 | 9.06 | 6.55 | 6.91 | 12% |
| 10 | 6.31 | 8.05 | 4.41 | 5.41 | 30% |
| 15 | 5.03 | 7.33 | 2.81 | 4.93 | 44% |
| 25 | 3.83 | 4.61 | 1.72 | 3.06 | 55% |

When the clock frequency deviates from its nominal value, or the electrical length of the transmission lines varies from an integral multiple of the wavelength, the resonance phenomenon of the transmission line shunts diminishes. As a result, the synchronization capabilities of the transmission line shunts degrade accordingly. The frequency response properties of the multilevel clock network of FIGS. 2A and 2B were also tested by simulation. The wire width of the lowest level transmission line was set to 5 μm and the clock rate to 10.33 GHz. The −3 dB bandwidth of the output voltages was 0.42 GHz. At 10.33 GHz, a minimal skew of 1.38 degrees is achieved. In the frequency range of 10.2 GHz to 10.5 GHz, the skew lies between 1.38 degrees and 2.5 degrees.
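The departure from resonance can be quantified as an error in electrical length. The Python sketch below is a back-of-the-envelope illustration, assuming a phase velocity that does not change with frequency. It computes how far a line cut to one wavelength at the nominal clock rate departs from 360 degrees at other frequencies. It does not model the resulting skew, which depends on the full network, and the function name is hypothetical.

```python
def electrical_length_error_deg(frequency, nominal_frequency, n_wavelengths=1):
    """Degrees by which a line cut to n wavelengths at nominal_frequency
    departs from n*360 degrees when driven at frequency.

    Assumes the phase velocity does not change with frequency, which is an
    idealization made for this sketch.
    """
    return 360.0 * n_wavelengths * (frequency / nominal_frequency - 1.0)


NOMINAL_HZ = 10.33e9
errors = {
    f: electrical_length_error_deg(f, NOMINAL_HZ)
    for f in (10.2e9, 10.33e9, 10.5e9)
}
```

Over the 10.2 GHz to 10.5 GHz range considered above, a one-wavelength line would be off by a few degrees of electrical length at either end under this assumption.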

While specific embodiments of the present invention have been shown and described, it should be understood that other modifications, substitutions and alternatives are apparent to one of ordinary skill in the art. Such modifications, substitutions and alternatives can be made without departing from the spirit and scope of the invention, which should be determined from the appended claims.

Various features of the invention are set forth in the appended claims.

1. A VLSI clock distribution circuit, comprising: a clock distribution tree having multiple levels, a plurality of drivers in each of the levels having a substantially similar distance from the center of the clock distribution tree; and at least one set of differential transmission lines, the set of differential transmission lines connecting drivers in a common level of the clock distribution tree, the length of the differential transmission lines between drivers being an integral multiple of the wavelength of a clock signal being distributed by said clock distribution tree.
2. The circuit of claim 1, wherein said at least one set of differential transmission lines comprises a plurality of sets of differential transmission lines.
3. The circuit of claim 2, wherein the length of the differential transmission lines between drivers is equal to one wavelength of the clock signal being distributed by said clock distribution tree.
4. The circuit of claim 3, wherein said clock distribution tree comprises an H-tree that receives the clock signal being distributed at its center, and each of said plurality of sets of differential transmission lines comprises a spiral that connects drivers on a common level of the H-tree.
5. The circuit of claim 4, wherein widths of plurality of sets of transmission lines are optimized to minimize skew between drivers in said clock distribution tree.
6. The circuit of claim 2, wherein said clock distribution tree comprises an H-tree that receives the clock signal being distributed at its center, and each of said plurality of sets of differential transmission lines comprises a spiral that connects drivers on a common level of the H-tree.
7. The circuit of claim 6, wherein widths of plurality of sets of transmission lines are optimized to minimize skew between drivers in said clock distribution tree.
8. The circuit of claim 7, further comprising a clock source providing said clock signal as a sinusoidal clock signal at the center of said clock distribution tree.
9. The circuit of claim 8, wherein drivers connected to a lowest level spiral of said plurality of sets of differential lines comprise sine to square wave converters.
10. The circuit of claim 9, further comprising a local distribution network receiving square wave clock signals from said drivers connected to the lowest spiral.
11. A VLSI clock distribution circuit, comprising: clock distribution tree means for distributing a clock signal from a clock source among clock drivers in a VLSI circuit; transmission line shunt network means for reducing skew between the clock drivers.
12. The circuit of claim 11, wherein said transmission line shunt means optimally reduce skew between the clock drivers.
13. A method for distributing clock signals in a VLSI circuit, the method comprising steps of: distributing sinusoidal clock signals among clock drivers in the VLSI circuit through a multi-level clock distribution tree; shunting clock drivers in each common level of the clock distribution tree with a differential transmission line, wherein the length of the differential transmission line between each clock driver is an integral multiple of the wavelength of the clock signals.